AS-COMA: An Adaptive Hybrid Shared Memory Architecture
Abstract
Scalable shared memory multiprocessors traditionally use either a cache-coherent non-uniform memory access (CC-NUMA) or a simple cache-only memory architecture (S-COMA) memory architecture. Recently, hybrid architectures that combine aspects of both CC-NUMA and S-COMA have emerged. In this paper, we present two improvements over other hybrid architectures. The first improvement is a page allocation algorithm that prefers S-COMA pages at low memory pressures. Once the local free page pool is drained, additional pages are mapped in CC-NUMA mode until they suffer sufficient remote misses to warrant upgrading to S-COMA mode. The second improvement is a page replacement algorithm that dynamically backs off the rate of page remappings from CC-NUMA to S-COMA mode at high memory pressure. This design dramatically reduces the amount of kernel overhead and the number of induced cold misses caused by needless thrashing of the page cache. The resulting hybrid architecture is called adaptive S-COMA (AS-COMA). AS-COMA exploits the best of S-COMA and CC-NUMA, performing like an S-COMA machine at low memory pressure and like a CC-NUMA machine at high memory pressure. AS-COMA outperforms CC-NUMA under almost all conditions, and outperforms other hybrid architectures by up to [...] at low memory pressure and up to [...] at high memory pressure.
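The two policies in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `Node` class, the `threshold` and `backoff_factor` parameters, the victim selection, and the multiplicative back-off rule are all assumptions chosen only to make the idea concrete. Pages are mapped in S-COMA mode while local free pages remain, fall back to CC-NUMA mode afterwards, and are upgraded after enough remote misses, with the upgrade threshold raised whenever an upgrade forces a replacement.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SCOMA = "S-COMA"
    CCNUMA = "CC-NUMA"


@dataclass
class Node:
    free_pages: int              # size of the local free page pool
    threshold: int = 4           # hypothetical: remote misses needed before an upgrade
    backoff_factor: int = 2      # hypothetical: threshold multiplier under memory pressure
    mode: dict = field(default_factory=dict)    # page -> current Mode
    misses: dict = field(default_factory=dict)  # CC-NUMA page -> remote miss count

    def allocate(self, page):
        """Prefer S-COMA while the free pool lasts, otherwise map in CC-NUMA mode."""
        if self.free_pages > 0:
            self.free_pages -= 1
            self.mode[page] = Mode.SCOMA
        else:
            self.mode[page] = Mode.CCNUMA
            self.misses[page] = 0
        return self.mode[page]

    def remote_miss(self, page):
        """Count a remote miss on a CC-NUMA page and upgrade it at the threshold."""
        if self.mode[page] is not Mode.CCNUMA:
            return self.mode[page]
        self.misses[page] += 1
        if self.misses[page] >= self.threshold:
            self._upgrade(page)
        return self.mode[page]

    def _upgrade(self, page):
        if self.free_pages > 0:
            self.free_pages -= 1
        else:
            # Assumed policy: evict the first S-COMA page found and downgrade it.
            victim = next((p for p, m in self.mode.items() if m is Mode.SCOMA), None)
            if victim is None:
                return  # nothing to replace, so the page stays in CC-NUMA mode
            self.mode[victim] = Mode.CCNUMA
            self.misses[victim] = 0
            # Assumed back-off: each forced replacement makes later upgrades rarer.
            self.threshold *= self.backoff_factor
        self.mode[page] = Mode.SCOMA
        self.misses.pop(page, None)
```

With `Node(free_pages=1, threshold=2)`, the first allocated page is mapped in S-COMA mode and the second in CC-NUMA mode. Two remote misses on the second page trigger an upgrade that evicts the first page and doubles the threshold, so the evicted page then needs four misses to be upgraded again.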
Similar resources
Collective Memory as a Measure to Evaluate the Infill Architecture Innovations in Historic Contexts (Case Study: Historic Context of Imamzadeh Yahya in Tehran)
Historic contexts remind us of an era when cities were built based on the needs, goals, and preferences of their inhabitants. In other words, the mental world of both the builders and the inhabitants was closely interrelated. But by ignoring citizens' memories and interests and their mental needs, today's interventions with rapid developments within historic contexts have led to amnesia and the...
Hybrid Parallelization Techniques for Lattice Boltzmann Free Surface Flows
In the following, we will present an algorithm to perform adaptive free surface simulations with the lattice Boltzmann method (LBM) on machines with shared and distributed memory architectures. Performance results for different test cases and architectures will be given. The algorithm for parallelization yields a high performance, and can be combined with the adaptive LBM simulations. Moreover,...
Hybrid MPI-thread parallelization of adaptive mesh operations
Many of the world’s leading supercomputer architectures are a hybrid of shared memory and network-distributed memory. Such an architecture lends itself to a hybrid MPI-thread programming model. We first present an implementation of inter-thread message passing based on the MPI and pthread libraries. In addition, we present an efficient implementation of termination detection for communication r...
Efficient Coherency and Synchronization Management in SCI based DSM systems
The performance of shared memory applications on Distributed Shared Memory (DSM) depends to a great part on the existence of efficient synchronization mechanisms and relaxed consistency models with a small management overhead. Using some of the core features of SCI, remote memory access and atomic transactions, highly efficient solutions for these areas have been developed and implemented withi...
Hybrid Programming with OpenMP and MPI
The basic aims of parallel programming are to decrease the runtime for the solution to a problem and increase the size of the problem that can be solved. The conventional parallel programming practices involve a pure OpenMP implementation on a shared memory architecture (Fig. 1) or a pure MPI implementation on distributed memory computer architectures (Fig. 2). The largest and fastest compute...
Publication date: 1998